AD-A264 QPQ  REPORT DOCUMENTATION PAGE

2 REPORT DATE

March 1993

4 TITLE AND SUBTITLE

NEURAL  ADAPTIVE  SENSORY  PROCESSING  FOR  UNDERSEA  SONAR 

6  AUTHOR(S) 

S.  L.  Speidel 

7 PERFORMING ORGANIZATION NAME(S) AND ADDRESS(ES)

Naval  Command,  Control  and  Ocean  Surveillance  Center  (NCCOSC) 

RDT&E  Division 

San  Diego,  CA  92152-5001 




9.  SPONSORING/MONITORING  AGENCY  NAME(S)  AND  ADDRESS(ES) 

Office  of  Chief  of  Naval  Research 
Code  01122,  OCNR-20T 
Arlington,  VA  22217-5000 

11  SUPPLEMENTARY  NOTES 


DTIC ELECTE MAY 27 1993

10 SPONSORING/MONITORING AGENCY REPORT NUMBER


12a DISTRIBUTION/AVAILABILITY STATEMENT


12b  DISTRIBUTION  CODE 


Approved  for  public  release;  distribution  is  unlimited. 


13.  ABSTRACT  (Maximum  200  words) 

Neural adaptive beamformers (NABFs) utilize neural paradigms to accomplish desired adaptions that are associated with sensory-field-responsive partitioning and selection processes. Kohonen-type organization and Hopfield-type optimization have been formulated as NABF mechanisms and have been applied to test data. Formulations and results are included. NABFs are also used in conjunction with a learning network for interpretation of weight sets as population codings of direction. An example is included. Finally, desirable qualities of human auditory response are being interpreted in the context of neural adaptive beamforming for the purpose of creating an integrated processing structure that incorporates NABFs, a cochlear model, and an associative memory as part of a total spatio-temporal processing scheme for selective attention.


93-11953 



Published in IEEE Journal of Oceanic Engineering, Vol. 17, No. 4, Oct 1992, pp. 341-350.


14  SUBJECT  TERMS 


signal  processing  neural 

network  neural  networks 

beamforming 


15 NUMBER OF PAGES

16 PRICE CODE

17 SECURITY CLASSIFICATION OF REPORT: UNCLASSIFIED

18 SECURITY CLASSIFICATION OF THIS PAGE: UNCLASSIFIED

19 SECURITY CLASSIFICATION OF ABSTRACT: UNCLASSIFIED

20 LIMITATION OF ABSTRACT: SAME AS REPORT


NSN 7540-01-280-5500


Standard  form  298  (FRONT) 




Neural  Adaptive  Sensory  Processing  for 

Undersea  Sonar 

Steven L. Speidel


Abstract: Neural adaptive beamformers (NABFs) utilize neural paradigms to accomplish desired adaptions that are associated with sensory-field-responsive partitioning and selection processes. Kohonen-type organization and Hopfield-type optimization have been formulated as NABF mechanisms and have been applied to test data. Formulations and results are included. NABFs are also used in conjunction with a learning network for interpretation of weight sets as population codings of direction. An example is included. Finally, desirable qualities of human auditory response are being interpreted in the context of neural adaptive beamforming for the purpose of creating an integrated processing structure that incorporates NABFs, a cochlear model, and an associative memory as part of a total spatio-temporal processing scheme for selective attention.


I. Background

DURING the course of this work on sensory processing, information from the studies of biological systems, including psychosensory experiments on humans [1], [2], detailed physiological studies [3], and detailed neurological studies of animal sonar systems [4], has guided the philosophical and architectural orientation of the developing computational system. In this respect, studies of vertebrate auditory systems are germane to our work with regard to ocean sonar systems, both passive and active. For example, it is important to acknowledge that a desirable object orientation is supported, in the neurobiological context, by the integration of the computation products of different nuclei of the brain through their interaction along afferent and efferent propagation channels. This often includes the integration of different sensory modalities, but even within a single modality there is considerable interactive integration. For example, cognitive auditory function encompasses the simultaneous acts of 1) attending memory according to stimuli, 2) attending stimuli according to memory, and 3) attending stimuli and memory according to an ongoing thought process.

What does an auditory system do? One answer is that the auditory system creates the perception of individuated, recognizable sounds out of the ongoing composite excitation received at the two ears. To perform this function upon the excitation of man-made sensor arrays using a man-made computer is the goal of the work being reported here.

It has long been known that beamforming is useful for aiding the interpretation of composite excitations in sonar applications where multiple sounds are simultaneously incident upon sensory arrays. What do beamformers do? Beamformers are devices that exhibit nonuniform response across ranges of dimensions of the stimulus field. It is suggested that this is an important function of the neurons of our brains and of cognitive sensory processors in general, including those that are man-made. In these devices, the focussing of the beamformers is to be directed by providing exemplary excitations of interest (as from memory). Hence these devices are to perform adaptive beamforming and adjust their ranges of maximum response according to the content of the sensory field.
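As a minimal illustration of nonuniform response across one dimension of the stimulus field, the sketch below evaluates a conventional (non-adaptive) narrowband beam pattern for a uniform line array. The eight-element array, the half-wavelength spacing, the steering angle, and all function names are assumptions made for the example; they are not details of the processors described in this paper.

```python
import cmath
import math

def steering_vector(num_sensors, spacing_wavelengths, angle_deg):
    """Phase factors across a uniform line array for a unit plane wave."""
    phase_step = 2.0 * math.pi * spacing_wavelengths * math.sin(math.radians(angle_deg))
    return [cmath.exp(1j * phase_step * n) for n in range(num_sensors)]

def beam_response(weights, spacing_wavelengths, angle_deg):
    """Normalized magnitude of y = sum_n conj(w_n) * x_n for a plane wave from angle_deg."""
    x = steering_vector(len(weights), spacing_wavelengths, angle_deg)
    y = sum(w.conjugate() * xn for w, xn in zip(weights, x))
    return abs(y) / len(weights)

# Hypothetical array: 8 sensors, half-wavelength spacing, steered to 41 degrees.
weights = steering_vector(8, 0.5, 41.0)
on_target = beam_response(weights, 0.5, 41.0)   # response in the steered direction
off_target = beam_response(weights, 0.5, 0.0)   # response to a broadside arrival
```

The response is largest in the steered direction and reduced elsewhere. An adaptive beamformer would adjust `weights` from the data rather than fixing them in advance.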

Augmented Kohonen-style organization and Hopfield-style optimization processes have been used to perform “neural” adaptive beamforming [5]-[8], where “neural” is being used in the sense of the popular metaphor. The NABF paradigms easily integrate the qualities of attentiveness and binding when they are applied to partitioned/segmented sensory excitation. A brief overview of these two principal neural beamforming schemes is given here.

1) The crossbar beamformer is based on the Hopfield crossbar circuit. In its simplest form the crossbar adaptive beamformer (CABF) can be described as an adaptive combiner or mixer [9] with an augmented crossbar network of graded response and associated regulator embedded as processing kernel elements [10]. A method for controlling a crossbar circuit is derived from Wiener optimal least-squares filtering principles. The regulator, crossbar network, and adaptive combiner arrangement function to selectively attend task-relevant sensory excitation from within a total excitation which includes nonrelevant components. Adaption of the network response occurs without explicit computation of the error between the output and the exemplar. This is possible because hypothesis/memory-driven exemplars are supplied at the input. The CABF circuit form allows an organized implementation in dedicated VLSI hardware. Simulations suggest convergence in merely a few time constants of the hardware devices.

2) The multivector adaptive beamformer (MABF) is related to Kohonen feature map learning. It does a statistical segmentation of the stimulus field, independent of any exemplars, and it can serve as an element of continually modifiable associative memory or a classifier; it is formulated in terms of multidimensional geometrical calculus

0364-9059/92$03.00 © 1992 IEEE
resulting in the simplified expressions

$$\mathrm{Re}\{y\} = \sum_i w_i x_i$$

$$\xi = E\left\{\left[\left(\sum_i w_i x_i\right) - h\right]^2\right\}.$$

For simplicity it is assumed here that the quadrature input is available. Other cases have been considered elsewhere (Speidel, 1990).
A. Crossbar Beamformer

Key elements of the beamforming structure (Fig. 1) are: 1) activity representing propagation of time and/or phase lagged excitation to the adaptive synapses, 2) a model that relates conductivity and current flow at the synapses to measures of time averaged coexcitation of afferent activities with each other and measures of correlation of afferent activity with exemplars/memories, 3) the crossbar kernel, and 4) multiplicative connections from the crossbar kernel outputs ($c_i$) onto the combiner portion of the CABF.

The CABF minimizes the mean square error ($\xi$) between the real part of the beamformer output ($y$) and the exemplar ($h$):

Fig. 1. The crossbar adaptive beamformer.

where $E$ denotes the expectation. Let the beamformer weights, ($w_i$), be related to the Hopfield circuit output voltages according to
A direct relationship is established between the magnitude of $\xi$ and the value of the Hopfield energy function, $H$ (a Lyapunov function associated with the crossbar circuit). This relationship is established, in practice, by applying a controller (the black box labeled “membrane model” in Fig. 1) to the crossbar arrangement. The required behavior of the controller can be derived analytically by equating $\xi$ and $H$ and solving for the currents and connectivities of the crossbar as functions of the excitations ($x_i$).

Fig. 2. Hopfield crossbar arrangement.

The crossbar circuit which is illustrated in Fig. 1 is shown in more detail in Fig. 2. The circuit is simply a number ($N$) of charge flow control devices, each connected with variable efficacy to each of the others. Hopfield has shown that if the connections are symmetric ($T_{ij} = T_{ji}$) and direct feedback is zero ($T_{ii} = 0$), then the circuit will converge to stable states representing the local minima. These conditions on the connectivity are sufficient but not always necessary, dependent on the application. It is convenient to choose the simplified form of the Lyapunov function
Hopfield and Tank (1985) use this form when they operate at the high-gain limit. However, here the primary motive is to simplify the expression for $\varphi$. The behavior of the neglected term of $\varphi$ (as a function of the gain) is examined elsewhere [8]. The term can be neglected without much sensitivity to the gain ($g$) in this application. It is convenient to set the doubled modified Lyapunov function equal to the mean square error ($\xi$) minus the mean square exemplar ($h$). The quantities to be equated
The resulting expressions for the connectivities and currents are

connectivity:

$$T_{ij}(t) = -\frac{1}{\tau}\int_{t-\tau}^{t} x_i(\eta)\,x_j(\eta)\,d\eta, \qquad T_{ii} = 0 \tag{8}$$

ion flow:

$$I_i(t) = \frac{1}{\tau}\int_{t-\tau}^{t} x_i(\eta)\,h(\eta)\,d\eta \;-\; V_i(t-\tau)\,\frac{1}{\tau}\int_{t-\tau}^{t} x_i(\eta)\,x_i(\eta)\,d\eta \tag{9}$$

where $\tau$ is a response latency period, and $V_i(t - \tau)$ is the output amplitude at the end of the previous epoch from the $i$th element. In the case of discrete time-step simulations, the expectation value is usually evaluated by summing.
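The equating of the doubled Lyapunov function with the mean square error less the mean square exemplar can be checked numerically when expectations are evaluated by summing. The sketch below is only an illustration under stated assumptions: real-valued inputs, weights taken equal to the output voltages, the high-gain form $H = -\frac{1}{2}\sum_{ij} T_{ij} V_i V_j - \sum_i I_i V_i$, connectivities read as negative averaged input products, and currents read as averaged input-exemplar products. It also keeps the diagonal terms inside `T` instead of moving them into the current as a feedback term. The data and names are synthetic and hypothetical.

```python
import random

def mean(values):
    return sum(values) / len(values)

def mse_minus_exemplar_power(w, xs, hs):
    """E[(sum_i w_i x_i - h)^2] - E[h^2], with E evaluated by summing over samples."""
    errors = [(sum(wi * xi for wi, xi in zip(w, x)) - h) ** 2 for x, h in zip(xs, hs)]
    return mean(errors) - mean([h * h for h in hs])

def doubled_energy(v, T, I):
    """2H for H = -1/2 sum_ij T_ij v_i v_j - sum_i I_i v_i."""
    n = len(v)
    quad = sum(T[i][j] * v[i] * v[j] for i in range(n) for j in range(n))
    lin = sum(I[i] * v[i] for i in range(n))
    return -quad - 2.0 * lin

rng = random.Random(0)
n, samples = 3, 200
xs = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(samples)]
hs = [0.7 * x[0] - 0.2 * x[2] + 0.1 * rng.gauss(0.0, 1.0) for x in xs]

# Connectivities from averaged coexcitation, currents from correlation with the exemplar.
T = [[-mean([x[i] * x[j] for x in xs]) for j in range(n)] for i in range(n)]
I = [mean([x[i] * h for x, h in zip(xs, hs)]) for i in range(n)]

w = [0.3, -0.5, 0.8]          # arbitrary trial weights (assumed equal to voltages)
lhs = mse_minus_exemplar_power(w, xs, hs)
rhs = doubled_energy(w, T, I)
```

For any trial weight vector the two quantities agree to rounding error, so lowering the energy of such a circuit lowers the mean square error by the same amount.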

The voltage controlled feedback to the input current, (9), is equivalent to a direct negative feedback connection, $T_{ii} \neq 0$, if the value of the voltage feedback term of the current is allowed to vary during equilibration of the crossbar circuit. However, in the present implementation, the current is fixed, i.e., the controller (membrane model) clamps the current during equilibration. Thus, there are in effect two time-scales: the epoch scale and the scale of dynamic equilibration of the crossbar arrangement (membrane dynamics) in which the time increment is chosen to be a fraction of the device (cell) time-constant.
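The inner time-scale can be sketched as a simple Euler integration in which the currents stay clamped while the circuit equilibrates. The membrane equation used here is the standard Hopfield graded-response form with a tanh transfer function; that form, the two-element connectivity, the current values, and the step size are assumptions chosen for illustration and are not taken from this paper.

```python
import math

def net_input(T, I, u, gain, tau):
    """Right-hand side of du_i/dt = -u_i/tau + sum_j T_ij V_j + I_i, with V = tanh(gain * u)."""
    v = [math.tanh(gain * ui) for ui in u]
    n = len(I)
    return [-u[i] / tau + sum(T[i][j] * v[j] for j in range(n)) + I[i] for i in range(n)]

def equilibrate(T, I, gain=1.0, tau=1.0, dt_fraction=0.1, steps=2000):
    """Euler-integrate the membrane dynamics with the currents I held fixed (clamped)."""
    u = [0.0] * len(I)
    dt = dt_fraction * tau            # time increment is a fraction of the time constant
    for _ in range(steps):
        du = net_input(T, I, u, gain, tau)
        u = [ui + dt * dui for ui, dui in zip(u, du)]
    return u

T = [[0.0, -0.5], [-0.5, 0.0]]        # symmetric connections, zero direct feedback
I = [0.5, 0.2]                        # currents clamped for the whole epoch
u_final = equilibrate(T, I)
voltages = [math.tanh(ui) for ui in u_final]
residual = max(abs(d) for d in net_input(T, I, u_final, 1.0, 1.0))
```

At the end of the epoch the controller would recompute the currents from new data and the equilibration would start again, which is the outer time-scale.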

Fast  and  compact  analog  electronic  implementations  of 
(8)  and  (9)  may  be  used  to  compose  the  controller  and 
crossbar  circuit.  The  specific  circuits  would  be  the  “leaky 
integrators,”  differentiators,  and  hardware  convolvers  that 
have  been  pivotal  advancements  in  the  area  of  neural 
hardware  [12],  [13]. 

B.  Multivector  Beamformer 

The action of the MABF is pictorially represented in Fig. 3. During learning, memory formation is accomplished according to

$$W_i^{\mathrm{new}} = W_i^{\mathrm{old}} + \alpha\,(x_k - c_{ik}) \wedge d_{ik} \tag{10}$$

where $W_i$ is the “weight plane,” expressed as a bivector, of the $i$th processing element (PE) of a layer of PE’s that receives a fanout of the inputs, $\alpha$ is the learning rate, $d_{ik} = \bar{x}_k \cdot \bar{W}_i^{\mathrm{old}}$ (the bar denotes normalization), $c_{ik}$ is the projection of $x_k$ onto $W_i$, normalized to unit length, and the symbol $\wedge$ denotes the wedge product [11]. For the case of temporal learning, the input vectors are formed from a tapped delay line.
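Two ingredients of the learning rule can be sketched concretely: the wedge product, stored here as an antisymmetric matrix, and the formation of input vectors from a tapped delay line. In this sketch the vectors `c` and `d` are passed in by the caller, because their computation from the weight plane is not reproduced; the matrix storage, the function names, and the sample values are assumptions for illustration only.

```python
def wedge(a, b):
    """Wedge product a ^ b as an antisymmetric matrix with entries a_j*b_k - a_k*b_j."""
    n = len(a)
    return [[a[j] * b[k] - a[k] * b[j] for k in range(n)] for j in range(n)]

def mabf_update(W_old, x, c, d, alpha):
    """One step of W_new = W_old + alpha * (x - c) ^ d, with c and d supplied by the caller."""
    diff = [xi - ci for xi, ci in zip(x, c)]
    delta = wedge(diff, d)
    n = len(x)
    return [[W_old[j][k] + alpha * delta[j][k] for k in range(n)] for j in range(n)]

def tapped_delay_vectors(samples, num_taps):
    """One input vector per time step: the current sample and the preceding ones, newest first."""
    return [[samples[t - k] for k in range(num_taps)]
            for t in range(num_taps - 1, len(samples))]

W0 = wedge([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])           # an initial weight plane
W1 = mabf_update(W0, [0.0, 0.0, 1.0], [0.0, 0.0, 0.0], [1.0, 0.0, 0.0], alpha=0.5)

signal = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0]
vectors = tapped_delay_vectors(signal, 3)
```

Because a wedge product is antisymmetric, the update keeps the weight plane antisymmetric, so it remains a valid bivector after every step.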

III.  NABF  Application 
A.  Sonobuoy  Deployment  Test  Case 

The performance of the CABF was validated against composite sounds of a real sonar scene impinging upon a spatially complex array. The data were obtained from the Sonar Thinned Random Array Program (STRAP). Fig. 4 depicts the spatial arrangement of 11 sonobuoys that were dropped in the Atlantic Ocean. A known source was active at a distance of approximately 10 mi. It consisted of two frequencies, seven and eleven Hertz. Figs. 5 and 6 show spectral densities from various channels. Notice the inconsistency across the channels.

Fig. 4. The deployment pattern for the sonobuoys.

The  temporal  recordings  made  at  these  buoys  were 
played  into  the  beamformer.  Fig.  7  shows  the  adapted 
sensitivity  of  the  CABF  as  a  function  of  time.  The  CABF 
is  correctly  attending  the  desired  signal  at  approximately 
41°.  Though  the  beamformer’s  attention  is  occasionally 
distracted,  these  results  are  very  good  when  you  consider 
that  no  spectral  preprocessing  was  performed,  i.e.,  the 
desired  signal  was  still  mixed  with  the  other  interfering 
components  at  a  level  of  approximately  -20  dB  with 
respect  to  some  higher  frequency  components  (Fig.  5). 

Currently, more tests are being performed with interfering signals arriving at other directions. Fig. 8 shows what happens when an identical signal is simultaneously coming in at zero degrees and at approximately the same sound level. Notice how the attention of the beamformer flips alternately back and forth between 41° and 0° azimuth. This is not always undesirable but in some contexts it makes the simultaneous tracking of multiple sources of the same kind difficult.

B.  Fast  Computation  via  Digital  Hardware 

Two digital signal processing boards are currently being utilized to achieve real-time performance for combined wavelet processing (as in Aussel, 1989) and adaptive beamforming. These boards include analog input/output daughter boards with built-in active filters. Each board gives two input channels with simultaneous sampling to 400 kHz at 12 bits resolution, which is sufficient for sonar processing, and a floating point DSP with performance at 50 million floating-point operations per second (MFLOPS). With the two boards (100 MFLOPS performance) the processor operates in real time. The clocks of multiple boards are slaved together to give simultaneous sampling on all channels, and communication busses between the boards allow true parallel processing.

Fig. 5. Spectral density for the STRAP data set, channel 1. It is plotted on three different frequency scales.

In  experiments  using  an  array  of  microphones  operated 
in  a  laboratory  room,  the  CABF  was  able  to  locate  a 
sound  source  while  rejecting  other  interfering  sounds  in 
the  laboratory.  For  these  experiments,  notes  were  played 
on  musical  instruments  in  a  high  noise  background. 
Recognition  in  this  case  was  pitch  dependent.  Future 


experiments  will  explore  pitch-independent  recognition  of 
musical  instruments. 

IV.  Population  Coding  and  Decoding  of 
Direction 

In the case where a set of sensors have been deployed in a manner that results in unknown or inaccurately known positions, or when environmental conditions produce multiple propagation paths, special techniques are required for associating directions with array phase patterns. As a concept demonstration, a neural network was trained to associate direction sines with patterns of the phase of sensory excitation. Its performance was tested with both regularly spaced and randomly dithered arrays.

One advantage of using this technique is that when training a network the output direction values can be referenced to a landmark or platform frame not necessarily associated with or related to the array geometry (although the array geometry will certainly affect the network’s performance). Also, if direction to target is used for training, rather than direction of incidence, then the direction association network (DAN) forms a propagation model as it learns its associations. Compensation for near-field effects (non-planar wavefronts across the aperture of the array) can also be attained.

Fig. 6. Spectral densities for three different channels. Scale was expanded to accentuate the frequency region of interest.

The patterns to be input to the DAN can be generated by an NABF weight generation process such as the CABF which, in this case, functions as an encoder of features of incoming signals. A useful neural computing cluster can thus be formed out of the CABF and a DAN. The cluster integrates the direction finding and adaptive beamforming functions (see Fig. 9). The azimuth and elevation may be output as direction sines/cosines. In the autonomous vehicles application, the output can be a control vector instead of the azimuth and elevation outputs shown.

A DAN that uses the backpropagation algorithm to learn to associate directions with beamformer weight patterns has been implemented and tested on a digital computer. It was found that a subnet can be trained to perform well at associating beam weights with directions even though the sensor positions are randomized and not known to the subnet.

The network consists of three layers. There are 18 processing elements in the first layer, four elements in the second layer, and one element in the output layer. The choice of the number of hidden units was made by doing exploratory runs with fewer and more units, compromising between performance gain and the computation load. The minimum number of hidden units was used, which was sufficient to give good performance on the training set.


Fig. 7. Mapping of beamformer sensitivity as a function of time and angle for the STRAP data set. The desired signal is coming in at approximately 40° azimuth. (Axes: angle in degrees, time.)


The network is given full layer to layer connectivity, i.e., every layer-2 element is connected to every layer-1 element and the single layer-3 element is connected to every layer-2 element.

Training inputs to the first layer are generated by simply computing beamforming weights (similar to what would be output by an RCN) for each angle of a set of test angles that span -90° to +90°. Resulting patterns are stored in a file, then they are fed to the network repeatedly during training with appropriate sines of the training angles presented simultaneously as output values for the network. The backpropagation training algorithm that was used closely follows the derivation by Rumelhart et al. [14].
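A compact sketch of such a fully connected 18-4-1 network trained by backpropagation is given below. Several choices are assumptions made only so that the example is self-contained: tanh units in both layers, bias terms, the learning rate, the number of passes, and the synthetic input patterns (real and imaginary parts of steering weights for a hypothetical nine-sensor, half-wavelength line array). It is not a reproduction of the network or the data used in this work.

```python
import math
import random

def pattern(angle_deg, num_sensors=9):
    """Input pattern (18 values) and target (sine of the angle) for one training angle."""
    s = math.sin(math.radians(angle_deg))
    x = []
    for n in range(num_sensors):
        x += [math.cos(math.pi * n * s), math.sin(math.pi * n * s)]
    return x, s

def make_network(n_in, n_hidden, rng):
    """Fully connected n_in -> n_hidden -> 1 network; the extra weight per unit is a bias."""
    w_hidden = [[rng.uniform(-0.1, 0.1) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w_out = [rng.uniform(-0.1, 0.1) for _ in range(n_hidden + 1)]
    return w_hidden, w_out

def forward(net, x):
    w_hidden, w_out = net
    hidden = [math.tanh(sum(w * v for w, v in zip(row, x + [1.0]))) for row in w_hidden]
    output = math.tanh(sum(w * v for w, v in zip(w_out, hidden + [1.0])))
    return hidden, output

def train_pass(net, data, rate):
    """One pass through the training set with per-pattern gradient-descent updates."""
    w_hidden, w_out = net
    for x, target in data:
        hidden, output = forward(net, x)
        delta_out = (output - target) * (1.0 - output * output)
        # Hidden deltas are computed before the output weights are changed.
        delta_hidden = [delta_out * w_out[j] * (1.0 - hidden[j] ** 2)
                        for j in range(len(hidden))]
        for j, v in enumerate(hidden + [1.0]):
            w_out[j] -= rate * delta_out * v
        for j, row in enumerate(w_hidden):
            for i, v in enumerate(x + [1.0]):
                row[i] -= rate * delta_hidden[j] * v

def mean_square_error(net, data):
    return sum((forward(net, x)[1] - t) ** 2 for x, t in data) / len(data)

rng = random.Random(0)
data = [pattern(a) for a in range(-90, 91, 5)]       # one entry every 5 degrees
net = make_network(18, 4, rng)
num_connections = 18 * 4 + 4 * 1                     # layer-to-layer links, biases excluded
error_before = mean_square_error(net, data)
for _ in range(100):
    train_pass(net, data, rate=0.02)
error_after = mean_square_error(net, data)
```

Comparing `error_before` with `error_after` gives a quick check that training is reducing the error on the training set. Evaluating on a finer grid of angles would give a test of the kind used for the error plots.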

Training and test data were synthesized for the following array geometries: 1) the training set was generated as if the sensor array is a regular-spaced, linear array with half-wavelength spacing between the sensors, and 2) the training set was generated with the sensor positions dithered by amounts determined by a pseudorandom number generator and within a wavelength of nominal regular spacing as in case 1) (see Fig. 10). In both cases the training sets have entries every 5.0° while the test set used to generate the error plots has entries every 1.0°.
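The synthesis of the two data sets can be sketched as follows. The nine-sensor array (giving 18 real inputs), the use of conventional steering weights as the stored patterns, the uniform dither of plus or minus half a wavelength, and the seed are assumptions for illustration; the actual weight computation and dither values are not reproduced here.

```python
import cmath
import math
import random

def weight_pattern(positions, angle_deg):
    """Real and imaginary parts of steering weights exp(j*2*pi*p*sin(angle)), p in wavelengths."""
    s = math.sin(math.radians(angle_deg))
    values = []
    for p in positions:
        w = cmath.exp(2j * math.pi * p * s)
        values.extend([w.real, w.imag])
    return values

def make_set(positions, step_deg):
    """(pattern, direction sine) pairs for angles from -90 to +90 degrees."""
    count = int(180 / step_deg) + 1
    angles = [-90.0 + step_deg * k for k in range(count)]
    return [(weight_pattern(positions, a), math.sin(math.radians(a))) for a in angles]

num_sensors = 9                                        # assumed: 9 sensors x (re, im) = 18 inputs
regular = [0.5 * n for n in range(num_sensors)]        # half-wavelength spacing
rng = random.Random(1)
dithered = [p + rng.uniform(-0.5, 0.5) for p in regular]

train_regular = make_set(regular, 5.0)                 # entries every 5 degrees
test_regular = make_set(regular, 1.0)                  # entries every 1 degree
train_dithered = make_set(dithered, 5.0)
test_dithered = make_set(dithered, 1.0)
```

The network never sees `regular` or `dithered` directly. It sees only the patterns, which is why the same training procedure applies when the sensor positions are unknown.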


Fig. 11(a) and (b) show error summaries for the regular-spaced array and with 100 and 1000 training passes, respectively. Note that in the training beyond 100 passes the network continues to improve performance at the near end-on incidence angles to the detriment of performance at the array broadside angles. One way to alleviate this problem would be to leave the data for end-on angles out of the training for linear array geometries.

Fig. 12(a) and (b) are similar to Fig. 11(a) and (b) except that the training and test data are relevant to the randomly dithered array. The same overtraining effect is evident here as was discussed relative to the regular spaced array case. The array is a nearly linear array since the sensor locations were only dithered by plus or minus one half wavelength.

V.  Cognitive  Sensory  Processing 
A.  Architectural  Overview 

A conceptual overview and some key elements of a comprehensive processing scheme are depicted in Figs. 13 and 14, respectively. The conceptual overview is a simplified representation that includes four principal functions: 1) the partitioning function, such as the partitioning done by the cochlea or retina or computational correlates of these, 2) the selection function, such as the adaptive focussing provided by adaptive beamforming or filtering, 3) position and motion determination, and 4) recognition. In reality, these functions are not performed separately. They are provided by the interacting elements of the processing scheme.

Fig. 9. Structure for simultaneous beamforming and interpretation of weights as a population coding of direction.

Fig. 10. Sensor positions for the dithered array (distances represented as fractions of a wavelength from nominal linear configuration).

The key elements of the processing scheme are: 1) multiple band filtering, 2) binaural correlation, 3) spatial attentional focussing, and 4) temporal attentional focussing/recognition. In practice, multiple-band filtering corresponding to the cochlear filtering indicated in Fig. 14 is performed using a scaled-wavelet formulation, providing a spread of bandwidths associated with the various best-frequencies [15]. It is recognized that this model cannot account for the sharpness of the cochlear response function near the best frequency [16].
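The way a scaled-wavelet bank yields bandwidths that spread with best frequency can be sketched in the frequency domain. The Gaussian-shaped mother response, the relative bandwidth of 0.25, and the octave-spaced best frequencies are assumptions picked for illustration; they are not the filters used in this work.

```python
import math

REL_BANDWIDTH = 0.25   # assumed bandwidth of the mother wavelet relative to its center

def mother_response(f, f0=1.0):
    """Gaussian-shaped magnitude response of a Morlet-like mother wavelet centered at f0."""
    return math.exp(-0.5 * ((f - f0) / (REL_BANDWIDTH * f0)) ** 2)

def band_response(f, scale):
    """Stretching the wavelet in time by `scale` compresses its response in frequency."""
    return mother_response(f * scale)

best_frequencies = [3.5, 7.0, 14.0, 28.0]                    # Hz, illustrative octave spacing
scales = [1.0 / fc for fc in best_frequencies]
bandwidths = [REL_BANDWIDTH * fc for fc in best_frequencies]  # grows with best frequency

def best_band(tone_hz):
    """Best frequency of the band that responds most strongly to a pure tone."""
    responses = [band_response(tone_hz, a) for a in scales]
    return best_frequencies[responses.index(max(responses))]
```

With these values a 7 Hz tone falls in the band centered at 7 Hz and an 11 Hz tone falls in the band centered at 14 Hz. The ratio of bandwidth to best frequency is the same for every band, which is the constant relative bandwidth that scaling a single wavelet produces.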

Spatial mappings have been observed in the colliculus in some vertebrates [17], [18] and in cortical field AI of the cat. Intermediate between the cochlear processing and the bandwise spatial mapping (Fig. 14) is an adaptive spatial process (not depicted) provided by the NABF. The darkened areas represent those attended (emphasized) by the NABF. Thus the spatial layer acts as a sieve, passing attended stimulus partials.

The spatial mappings can be related to the beamformer sensitivity maps of Figs. 7 and 8 where a single row of the spatial map as a function of time is plotted contiguously down the page. These maps represent the activity of a layer of beamformers with relatively fixed directional preferences. What is the purpose of forming a topological organization of the beamformers? The topological mapping creates an organization by which the cells for the various auditory bands which respond to a given object in space are close together. This simplifies the projection of the output of the spatial neurons to the cortical area where recognition is accomplished. This organization is also beneficial in the digital signal processing application, though the units are not actually arranged spatially but are arranged by ordinal number instead.

Fig. 11. The results of processing test data with the DAN after it has been trained. (a) The DAN was trained with 100 passes through the training set, and (b) the DAN was trained with 1000 passes. The DAN has four hidden units; the array is regularly spaced.

Notice that a loop has been formed, because the output of the spatial map projects as input to the recognition area, the output of which was utilized in the formation of the spatial map. In the biological case, it is not clear whether this system constantly feeds back on itself or if there is an afferent wave of activity followed by an efferent wave or vice versa. In the case of the computational model, it can be done either way and perhaps an investigation of this will lead to some conclusions. It could be that the loop leads to oscillations in some circumstances. If it does, the relationship of the oscillations may be studied in light of recent observations [19].
Fig. 12. Results of processing test data with the DAN after it has been trained. (a) The DAN was trained with 100 passes through the training set, and (b) the DAN was trained with 1000 passes. The DAN has four hidden units, and the array is randomly spaced.

Fig. 13. Conceptual overview of the model for cognitive sensory processing.

Fig. 14. Depiction of a comprehensive processor that includes cochlear partitioning, spatial maps, and associative memory.

In the computational model, the projections from the spatial map are input to a MABF process. The overall action of the recognition MABF is to segment and identify patterns of temporal activity across the auditory bands which are established in the cochlea. Each spatial stimulus segment is again segmented temporally according to memory by creating a time dependent sensitivity. The MABF attempts to create a stimulus partial which matches the temporal characteristics of each band, the total effect being to match the spectral content as a function of time, including relative phase variations, of an elicited memory using the spatially oriented spectral-band inputs as a basis. Recognition depends upon a combination of responses across the bands, i.e., the temporal qualities of all the individual frequency bands are being assessed simultaneously. Frequency modulation in the stimulus will appear as recognizable temporal variation in the bands. In the training mode, memories are established as a set of weightings.

The main function of the recognizer is to attend memory according to the stimulus. Only the temporally varying activities of the attended spatial segments elicit memories, because they are stronger. Initially, however, the system may not be attentionally focussed and the performance of the NABF on composite waveforms becomes important. In some cases, individual sounds may not be discerned without intervention of a thought process.

When a memory is elicited, a partial of the stimulus is produced through the action of the temporal beamformer, i.e., when a beamformer wins then the temporal vector associated with that memory is considered a partial of the stimulus (a significant one). This partial is fed back to the spatial mapping process, resulting in attention to or a focussing upon the spatial sector from where the partial came.

B.  Example 

This comprehensive processing scheme is being built into the fast digital implementation that was discussed above. The needed capacity will soon be supplied by five central processing units (CPUs) operating simultaneously. Currently, three CPUs are producing a wavelet filter bank as a simple cochlea model and are also producing the spatial maps for a small number of bands when running with two microphones. In real-time demonstrations single-band spatial maps are displayed as a function of time.

The final operation of this system will proceed as follows: the analog input boards sample the sensor excitations simultaneously and pass the data to the DSP boards where delays, Hilbert transforms, and wavelet bandpass filters are applied by use of digital filtering techniques. Initially, the adaptive selection processes are in a state as if no stimulus of interest is present; therefore there is no focussing. When a stimulus of interest appears, it will elicit the generation of a nearest match exemplar from associative memory, the exemplar will be fed to the adaptive process, and focussing will occur. The exemplar may change during the course of focussing. Of course, the associative memory will continually be providing exemplars even when no stimulus of interest is present, but there will be no significant focussing.
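The step in which a stimulus elicits a nearest match exemplar can be sketched with a very simple stand-in for the associative memory: a table of stored waveforms searched by normalized correlation. The table, the matching rule, the 100 Hz sample rate, and the tone frequencies are all assumptions for illustration. The associative memory intended for this system is a learning network, not a lookup of this kind.

```python
import math

def normalized_correlation(a, b):
    """Correlation of two equal-length waveforms, scaled to lie between -1 and 1."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def nearest_exemplar(stimulus, memory):
    """Name and score of the stored exemplar that best matches the stimulus."""
    scored = [(normalized_correlation(stimulus, ex), name) for name, ex in memory.items()]
    score, name = max(scored)
    return name, score

def tone(freq_hz, num_samples=100, rate_hz=100.0):
    """One second of a sine wave at the given frequency (hypothetical sample rate)."""
    return [math.sin(2.0 * math.pi * freq_hz * n / rate_hz) for n in range(num_samples)]

memory = {"tone_7hz": tone(7.0), "tone_11hz": tone(11.0)}
# Stimulus of interest (7 Hz) mixed with an interfering 23 Hz component.
stimulus = [0.8 * a + 0.3 * b for a, b in zip(tone(7.0), tone(23.0))]
name, score = nearest_exemplar(stimulus, memory)
```

The selected exemplar would then be supplied to the adaptive process as the reference toward which focussing occurs. A low best score could be treated as the case in which no stimulus of interest is present.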

In the first implementation of this simulated auditory process, there was no associative memory of exemplars. Task relevant exemplars were made available in a sequential manner from a stored set of digitized exemplars. This may be appropriate for a device which performs a single task (like tracking any object that is a member of a set of task-relevant objects that are relatively distinct from one another).